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Abstract. We propose evolutionary "analysis by synthesis" as a powerful tool in compu- 
tational neuroscience. We present applications of evolution strategies to the adaptation of 
dynamical systems for brain modeling. First, we compare evolutionary and gradient-based 
optimization of dynamic neural fields on an artificial benchmark problem. Then we adjust a 
few-neuron model developed for explaining our recent findings in a ncurobiological experiment, 
in which we studied the processing of temporal sequences of stimuli in the cortex. 
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1. INTRODUCTION 

We propose evolutionary "analysis by synthesis" guided by neurobiological 
knowledge as a powerful tool in computational neuroscience. The challenge 
is to force artificial evolution to favor solutions that are reasonable from the 
biological point of view. Such solutions are only likely to evolve if as much 
neurobiological knowledge as possible is used in the design process. This can 
be achieved by providing sufficient experimental data to evaluate the evolved 
systems and by a deliberate choice of the basic system structure. Additional 
knowledge can be incorporated into the objective function and constraints that 
ensure biological plausibility. 

The first step towards the automated design of biological models is parameter 
adaptation of predefined systems. In the following, we report on some recent 
applications of the covariance matrix adaptation evolution strategy (CMA-ES) 
for parameter optimization of different neurobiological models that describe 
population-coded information processing in the cortex, in particular in terms of 
dynamic neural fields [8] . These models have comparatively few, but meaningful 
parameters (time constants, coupling strengths, etc.). We believe that such kind 
of descriptions are particularly helpful to understand the principles of neural 
computation. Neural fields are a mean-field approach assuming that mainly 
mean anatomical and neurophysiological properties of neuron populations are 
relevant for the information processing. 

In the next section, we briefly review the CMA-ES. Then the two main parts 
of the article follow, a methodological and a biological one: In section 3, we 
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consider the adaptation of dynamic neural fields. Evolutionary and gradient- 
based optimization are compared in a benchmark scenario. In section 4, we 
present an application to the adaption of a few-neuron model. The model is 
fitted to the data of our recent experimental findings regarding processing of 
temporal sequences of tactile stimuli in rat sensory cortex. 



2. THE CMA-ES 

For the adaptation of the real-valued parameters of dynamical systems we use 
the elaborated CMA (covariance matrix adaptation) evolution strategy (CMA- 
ES). A general introduction to evolution strategies can be found in [3], for 
details of the CMA-ES the reader is referred to [11, 12]. 

In the CMA-ES the individuals are m-dimensional real- valued vectors, where 
m is the problem dimension. The offspring are generated by (global interme- 
diate) recombination and subsequent mutation. Here, mutation means adding 
an m-dimensional normally distributed random vector with zero mean. Deter- 
ministic (/i, A)-selection is used, i.e., the best fi of the A offspring form the next 
parent population. 

The CMA-ES implements an efficient and reliable way for adapting the 
covariance matrix of the mutation distribution. Thereby, the algorithm becomes 
invariant under orthogonal transformations of the search space (except for the 
initialization) and can perform efficient optimization with small population 
sizes. The adaptation of the covariance matrix makes use of two important 
concepts: derandomization and cumulation. The former means that the adap- 
tation of the covariance matrix is deterministic after variation and selection of 
the individuals: The mutation steps in the object parameter space that resulted 
in selected offspring are used directly for adapting the covariance matrix such 
that similar steps become more likely. Cumulation means that the path of the 
population over a number of past generations is taken into account for the 
adaptation of the mutation distribution in order to use the information from 
previous generations more efficiently. 



3. ADAPTATION OF DYNAMIC NEURAL FIELDS 

Dynamic neural fields (NFs) have been introduced as models of cortical in- 
formation processing [24, 1] and there is a growing interest in using them for 
biological and psychological modeling as well as for technical applications (e.g., 
see [7, 10, 4, 22, 9]). However, the problem arises how to find suitable parameters 
of NFs for a given task. 

In [14] , we adjusted the parameters of a one-dimensional single-layered NF [1] 
using gradient-based and evolutionary optimization as well as a simple hybrid 
algorithm, which switches from evolutionary to gradient-based optimization 
when the error drops below a predefined threshold. It turned out that gradient- 
based optimization showed better local search performance and that the hybrid 
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strategy combined positive aspects of both evolutionary and gradient-based 
methods (similar to the findings in [2]). 

In the following, we extend our investigation to gradient-based and evo- 
lutionary optimization of two-layered fields. Computation of the gradient of 
NFs is comparatively costly, see section 3.2. Hence, the question arises whether 
the increase in the number of object parameters leads to qualitatively different 
results compared to [14], e.g., to better local search performance of the evolution 
strategy. This hypothesis is supported by theoretical considerations (e.g., in 
[21]) showing that for increasing problem dimension a simple evolution strategy 
outperforms simple gradient-descent (needing m + 1 target function evaluations 
per optimization step, where m is the problem dimension) on simple artificial 
test functions. 

In section 3.1, we introduce the two-layered NF. Then we discuss the cal- 
culation of the partial derivatives of the model. In section 3.3 we present our 
benchmark problem and in 3.4 the results of our empirical comparison. 



3.1. Model 



We study the two-layered NF introduced in [8, 16] for modeling early visual 
information processing in the primary visual cortex of the cat. It is a coupled 
system of integro-differential equations describing an inhibitory and an exci- 
tatory layer of neurons. For a discrete space of n neurons, their dynamics are 
governed by 

n 

T u u x (t) = -u x (t) + ^2 w s( x ~ x')s xl (t) + h 



+ fsh[u x (t)} 



w u (x - x) f u [u x '{t)} - V x (t) 



Lx' = l 



T v V x (t) 



(1) 

(2) 



where u x (t) and v x (t) describe the average membrane potential at time t of a 
neuron located at site x £ {1, . . . , n} of the inhibitory and excitatory layer, re- 
spectively. The external input into the field — the stimulus — is denoted by s x (t). 
The resting potential (the potential in the absence of any input) is determined 
by h. The (membrane) time constants are given by t u and t v . The strengths 
of the couplings are determined by w u (.) = w 9ujCru (.), w v (.) = w 9v:Crv (.), and 
w s (.) = Wg SjrTs (.), which are bell-shaped functions with two parameters each: 
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The firing rate function /„[.] = / a „,^,4[-] and the shunting term / sn [-] 
/ashAhAhl-l are scaled sigmoidal functions with three parameters each: 
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The partial derivatives of f u and / sn with respect to their arguments are denoted 
by fu and / s ' h , respectively. 

Overall, we have m = 15 free parameters gathered in the vector 



3.2. Gradient-based optimization 

The goal of supervised optimization of NFs is to find parameters such that, 
given a stimulus pattern s x (t), the potential u x {t) (or the average activity 
f[u x (t)]) fits data points d x (t), i.e., measured membrane potentials (or firing 
rates), and eventually satisfies neurophysiological constraints. We assume that 
the measurements describe a potential pattern of excitatory neurons over the 
time interval [0,T]. The fit to the data can be achieved by minimizing the 
sum-of-squares error 



JU x=l 

where the masking function fi x (t) is one if a target value at time t and position 
x is available and zero otherwise [6]. 

In order to apply gradient-based optimization methods, we have to calculate 



for each parameter 0j, i = 1, . . . , m. 

The derivatives du x {t)/d9i can be determined by solving the so called vari- 
ation system "forward in time" [6, 18]. As pointed out in [14], methods like 
computing the gradient "backward in time" do not reduce the complexity in case 
of NFs. The variation system that estimates the effect of parameter changes on 
the NF dynamics is derived by differentiating the NF equations (1) and (2) with 
respect to the parameters, cf. [13] . This means, we solve the system of differential 
equations describing du x (t)/d9i for x = 1, . . . , n and i = 1, ..,m. This method 
is also known as real time recurrent learning, forward propagation, variation 
method, or forward propagation. For example, for a parameter Q{ (i 6 {1,2}) of 
the interaction kernel w u we have 
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Figure 1. Artificial target data for the adaptation of the two-layered neural field. The 
spatio-temporal pattern was generated by a "teacher field" defined by Eqs. (f) and (2). 



and 

The initial conditions are given by du x (Q) /dQi = 0 for x = 1, . . . , n and i = 
l,...,m. 

Equations like (7) and (8) have to be solved for each parameter for all 
positions. In this article, all dynamical systems are solved using the forward 
Euler method. Thus, calculating the gradient requires 0(mn 2 ) computations 
per simulated (discrete /Euler) time-step. 

Two problems become obvious: First, the systems has to be, at least piece- 
wise, differentiable w.r.t. the m parameters. For example, if non-differentiable 
transfer functions are used in the model, gradient-based optimization becomes 
problematic. Second, for more complex dynamical systems, computing the par- 
tial derivatives quickly becomes tedious. 



3.3. Experiments 

The task was to reproduce the artificial spatio-temporal pattern shown in Fig. 1, 
which was produced by a "teacher" NF. The number of neurons of the field was 
n = 101 and the parameters were h = —3, t u = t v = 10, a u = 15, a v = 25, 
g u = 195, g v = 250, 0 U = (3 sh = a u = a sh = 1, *& u = i? sh = 0, a s = 10, 
and gs = 60. The time interval was [0, ...,400]. The stimulus was given by 
85o(t) = 1 for 175 < t < 180 and s x (t) = 0 otherwise. Prior to the 400 time 
steps the field was iterated until u x (t) < 10 -6 for all positions x. 

We conducted two sets of experiments. First, the initial parameters were 
randomly chosen from the intervals h 6 [—5,0], t u ,t v 6 [1,10], a u ,a v ,as G 
[3,50], g u ,g v ,g s € [10,300], f3 u ,(3 sh £ [0.5,10], a u ,a sh £ [0.5,2], and tf,tf sh € 
[—0.1,0.1]. Second, the models were initialized ±10% around the optimal val- 
ues. The latter scenario corresponds to fine-tuning an already (e.g., manually) 
adjusted model. 

Before learning and fitness evaluation, respectively, the neurons were initial- 
ized to u x (0) = v x (0) = 0. The first 150 time steps were not considered for 
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fitness and gradient calculation, i.e., we set n x (t) = 0 for t < 150 and fj, x (t) = 1 
otherwise. This is necessary as the models need some time to relax when starting 
from the homogeneous initialization, see [13] for an analysis. 

In the CMA-ES, we set jj, = A and A = 10 and started with a global step size 
of = 0.5 (see [12]). For gradient-based optimization, we used the Broyden- 
Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method, cf. [20]. 
For the line search, a first-order method was used. Note that both the CMA- 
ES and the BFGS incrementally build up a matrix for second-order informa- 
tion about the (local) error surface (the covariance matrix of the mutation 
distribution and the approximated inverse Hessian, respectively). 

We performed 200 independent trials for each scenario, where the initial 
centers of mass of the parent populations and the starting points of the gradient- 
based optimization were the same. We compared the success rates vs. the 
computational costs of the algorithms. A trial was regarded as successful if 
the sum-of-squares error dropped below a threshold of either 10~ 3 or 10. The 
computational costs were approximated in the following way. One evaluation 
of a NF model, during a line search or for fitness evaluation, corresponded to 
one computational cost unit. The costs of computing the gradient were set to 
m+ 1, see previous section and [14]. 

3.4. Results 

When the NFs were initialized from the larger intervals, the lower threshold was 
not reached by any evolutionary trial and only by a single gradient-based one. 
In case of initialization close to the optimum, the BFGS passed the lower error 
threshold considerably more often. This means, the local search performance 
of the CMA-ES compared to the BFGS is not better than in the case of fewer 
objective parameters (at least for our deliberate choice of the test problem and 
the parameters of the algorithms). 



100 




20 000 



computational costs 



Figure 2. Success rate vs. computational costs. The upper two lines correspond to initialization 
close to the optimum, the lower ones to starting farther away. 
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The success rates for the larger threshold are shown in Fig. 2. Here the 
results are similar to the ones presented for the single-layered NFs in [14]. 
The CMA-ES is slower, but has a higher success rate. The difference is not 
statistically significant for the trials started close to the optimum (x 2 -test after 
computational costs of 20 000, p > 0.1), but for the trials starting farther away 
from the optimum (x 2 -test after computational costs of 20 000, p < 0.005). 
The BFGS gets stuck in local minima more often than the evolution strategy 
(however, it also converges faster to these local optima). 

4. ADAPTION OF A FEW-NEURON MODEL 

In this section we present an application of the previously discussed CMA-ES 
to a real-world computational neuroscience task. In order to reveal mechanisms 
of cortical temporal stimulus processing, we carried out a neurobiological ex- 
periment, developed a model, and fitted the model to the experimental data by 
applying the CMA evolution strategy. 

We employed the CMA-ES for adjusting model parameters for the following 
reasons: First, adjusting the parameters by hand, as it is common in computa- 
tional neuroscience, is usually time-consuming and may not lead to good fits, 
i.e., to models with small fitting errors. Second, we did not want to restrict a 
priori the search for parameter values to regions of desirable, i.e., biologically 
plausible parameter values. This helps judging the model. Obviously, this more 
global search is impossible by hand. Third, we wanted to test if the CMA-ES 
could serve as a a convenient tool for computational neuroscientists, i.e., can 
be used in an out-of-the-box fashion. 

4.1. Neurobiological experiment 

We examined how temporal sequences of sensory stimuli are processed by the 
sensory cortex. We applied trains of three tactile stimuli separated by identical 
interstimulus intervals (ISIs) to the hind paw of rats, and recorded the neural 
stimulus-response activity in the rats' primary somatosensory cortex (SI) [25]. 
The following ISIs were used: 22 ms, 30 ms, 45 ms, 60 ms, 80 ms, 100 ms, 120 ms, 
180 ms, and 240 ms. For each ISI, a mean firing rate (post-stimulus-time his- 
togram) was calculated by averaging the neural response data of all 64 stimulus 
trains for this specific ISI. The maxima of the three response peaks in the mean 
firing rate were read out as the stimulus responses ai iexp , a 2 , exp , and a 3>exp . For 
this study, only the relative response strengths a 2 , exp /ai,expj a 3 ,exp/ai,expj and 
a3,exp/a2,ex P were considered. 

4.2. Model 

In order to understand the biological neural mechanisms of temporal cortical 
stimulus processing, we modeled processing stages of the sensory signal in the 
brain. These include the thalamus and the primary somatosensory cortex which 
receives the signal from the thalamus [17]. In thalamus and cortex, processing 



8 



Schneider et al. 



of the sensory signal is done by populations of functionally similar neurons. Our 
approach follows a basic notion of computational neuroscience: each biological 
neuron population is represented by a single model neuron [23]. Therefore, our 
model consists of one excitatory thalamic neuron and two cortical neurons, 
one excitatory and one inhibitory, cf. Fig. 3. The model constitutes a limit 
case of a neural field, which is not spatially extended but restricted to a single 
location [23, 9]. We also modeled depletion of synaptic transmitter. Thereby, 
we assume that the experimentally observed effects were caused by cortical 
feedback inhibition and synaptic depletion. 

The model is described by the following system of coupled ordinary differ- 
ential equations: 

T e—^- = ~ x e(t) + f\pW s X m (t) - W int Xi(t)} (9) 



dxj(t) 

dt 
dx m (t) 



n — ^ = -xi(t) + f[w int x e (t)] (10) 



' e 



dt 



-x m (t) + F m , lsl (t) (11) 



^ = -c tP (t)F mM t) + 1 -^ • (12) 
at r d 

The variables x e , x«, and x m are the synaptic drives of excitatory, inhibitory, 
and thalamic neuron [19]. r e and Tj denote time constants for excitation and 
inhibition. We set r e = 10 ms, which is neurobiologically plausible, and Tj was 
adjusted by the CMA-ES. io int is the coupling strength between the excitatory 
and the inhibitory neuron determining their interaction. The value of w int was 
adjusted by the CMA-ES. /[.] is a nonlinear firing rate function. We chose 
the linear-threshold function f[x] = max(0,x). The parameter w s denotes the 
feed-forward coupling strength between the thalamic neuron and the excita- 
tory cortical neuron, and was arbitrarily set to = 10. i^isi is the firing 
rate of the thalamic neuron and models the stimulus for the different ISIs. 
^m,isi(*) was chosen to be a stepwise function for each ISI: F mjlsl (t) = 1 for 
t £ ' [0, 15] U [ISI , ISI + 15] U [2 • ISI, 2 • ISI + 15] ms and F m , ISI (t) = 0 otherwise. This 
corresponds to the time course of the application of the physical, tactile stimulus 
to the animals' hind paw. Synaptic depletion is described by the fraction of 
available transmitter, p, the depletion factor q, and the transmitter recovery 
time constant [5]. ct and were adjusted by the CMA-ES. 

For each ISI (stimulus -F m) isi) the differential equations are integrated and 
the maxima of the three response peaks in the excitatory firing rate f[x e (t)] 
are read out as the stimulus responses a 1<mod , a 2>mod , and a 2jmod . These model 
responses are compared with the corresponding experimental responses in the 
form of relative response strengths a 2imod /a limod , a 3imod /a liinod , and a 3 , mod /a 2jmod . 

4.3. Evolutionary optimization 

We applied the CMA-ES for determining the 4 model parameters Tj, u> int , r^, 
and ct for feedback inhibition and synaptic depletion, which we hypothesized 
to be mainly responsible for the experimentally observed effects. The goal of 
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Figure 3. Structure of the model with thalamic neuron m, excitatory cortical neuron e and 
inhibitory cortical neuron i. Excitatory and inhibitory synapses are marked by arrows and 
circles, respectively. Depletion of synaptic transmitter occurs at the synapse between thalamic 
and excitatory cortical neuron. The strength of this synapse is the product p ■ wg of a fixed 
weight wg and the fraction p of available transmitter. 



our optimization was to find values for the four parameters such that the model 
data a 2 , mod /ai,mod and a 3jmod /a ljmod fit the corresponding experimental data over 
all 9 ISIs 1 and also satisfies three constraints: 

1- a 3 ,mod > a 2 ,mod for ISI = 30 ms 

2. a 2 , mod > a 3 , mod for ISI > 100ms 

3. a 2 ,mod, as.mod / 0 for all ISIs . 

These constraints stress features of the experimental data we considered to be 
of particular significance. 

The error function E to be minimized consists of the mean-squared error 
Eq evaluating the deviation of the model data from the experimental data, 
and three additive penalty terms E±, E2, and E% measuring violation of the 
constraints: 



E — Eq + E\ + E2 + E3 

Eq 



1 /°2,mod a 2,exp\ 2 ^ 1 / a 3,mod a 3,exp \ ' 



(13) 
(14) 



-10 



( -^H°£L _ i) + 100 : if a 

\ a 2, mod / 



E2 
E3 



100 


if fl 2,mod 


< «-3,mod for ISI > 100 ms 


0 


else 




100 


if °2,mod 


or a 3jmod = 0 for at least one ISI 


0 


else 





:s ,u„d < a 2 ,mod for ISI = 30 ms (15) 

(16) 

(17) 



The values for the penalty terms were chosen to be much larger than the 
average mean-squared error forcing the optimization to stick to the constraints 
with high priority. A similar procedure has proved to be successful in the case 
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of fitting a neural field model to neurobiological data [15]. In test series of 
optimization trials we found that the simple penalty terms E<i and £3 are 
sufficient for satisfying constraints 2 and 3. However, we also found that an 
identical treatment of constraint 1 is insufficient: the majority of solutions 
violated the constraint. We concluded that there are large connected regions 
in the parameter space containing models that violate constraint 1. Adding a 
constant to the error function only "lifts" the error landscape on these regions. 
Due to the extendedness of the regions a lot of optimization trials start within 
these, and since there is no error gradient concerning the violation of constraint 
1, the trials also get stuck in local optima within the regions. Accordingly, the 
linear penalty term (15) had to be added measuring how strongly the constraint 
is violated. 

We carried out a series of twelve CMA-ES trials minimizing the error function 
(13). The parameter intervals for the search were restricted to reasonable values: 
Ct 6 [0,0.1], w int 6 [0,10], and 7^, Tj G [10,1000] ms. Parameter values that are 
larger or smaller than the boundaries of these intervals do not make sense 
because the corresponding models exhibit total response suppression, or time 
scales of dynamics much longer or shorter than those given by the experimental 
data. The other parameters were set to plausible values beforehand and kept 
fixed, cf. subsection 4.2. Numeric integration of the differential equations was 
done by means of the Euler method with step size 1 ms and initial conditions 
x e (0) = Xi(0) = x m (0) = 0 and p(0) = 1. In the CMA-ES we set p = 1 
and A = 4. The initial parameter values for Tj, io int , r^, and q were randomly 
chosen equally distributed from the intervals described above. We determined 
the fitness of a parameter set by integrating the differential equation system for 
every ISI, cf. subsection 4.2, and evaluating the model data according to the 
error function. 

4.4. Results 

The series of CMA-ES trials yielded several local optima of different quality. 
This indicates the difficulty of the optimization task. The computational costs of 
the single trials were low. In each trial, the CMA-ES settled on a local minimum 
within at most 1000 generations. 

We adopted the solution with the smallest error, E = 0.16, as the solution 
to our model selection problem. It fits the experimental data very well, cf. 
Fig. 4. Furthermore, its parameter values are biologically plausible. The other 
four valid solutions, i.e., solutions satisfying the constraints 1 to 3, fit the ex- 
perimental data clearly worse (E > 0.53). Additionally, their parameter values 
are less plausible than those of the best solution. These relations between the 
quality of the solution and the biological plausibility of the parameter values 
support our modeling approach. 

Interestingly, seven out of the twelve CMA-ES trials violated the first con- 
straint, although we used the penalty term E\ providing a strong error gradient. 
Exploring the reasons for this behavior is one of the aims of our future research. 

For our quite typical application, the CMA-ES proved to be a convenient 
tool. Setting rough boundaries to the parameter search and introducing simple 
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Figure 4- Experimental data (solid line) and model data of the best solution (dashed line). 
ai,a2 and are the neural response strengths to the 1st, 2nd, and 3rd stimulus. 



penalty terms for the constraint was sufficient to find an adequate solution 
within reasonable time. 



5. DISCUSSION AND CONCLUSIONS 

Evolutionary algorithms have proved to be beneficial for the optimization of 
models in computational neuroscience. They can be applied even if the quality 
function is non-differentiable, discontinuous, and noisy. Further, they allow for 
the incorporation of expert knowledge in several ways. In the case where the 
gradient can be derived with reasonable effort, it it should be exploited, e.g., in 
a hybrid algorithm as in [14] , because of the better local search performance of 
gradient-based methods. However, even if the fitness function is differentiable, 
there are reasons to prefer evolutionary optimization, as the derivatives may 
be very complicated and tedious to compute. Evolution strategies can be used 
nearly out-of-the-box and the increase in performance gained by a specialized 
(e.g., gradient-based) method can often be neglected compared to the time 
needed to get the specialized algorithm running. 
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These properties of evolution strategies make them suited for practical ap- 
plications in neuroscience. The process of fitting a model to experimental data 
is strongly accelerated. By enabling a "global" search in parameter space, the 
application of evolution strategies allow for a more thorough understanding of 
the model. The evaluation of the model hypotheses, i.e., how well they explain 
the experimental findings, is improved. The development of a basic model struc- 
ture and evolutionary optimization can be an iterative and interleaving process, 
where the (unsatisfying) outcome of the optimization may lead to a refinement 
of the model. 2 

Regardless of the used optimization method, parameter adaptation of non- 
linear dynamical systems has turned out to be a difficult optimization task, 
where — at least in our examples — the optimization algorithms tend to get trap- 
ped in local minima; bifurcations of the dynamics exacerbate the search process. 
These difficulties can partly be overcome by incorporating additional expert 
knowledge and multi-start strategies, as done in our experiments. 

Notes 

1 The quantity a 3imod /a2,mod was not included in the data comparison because it is derived 
from the other two quantities. 

2 An example of model refinement after evolutionary optimization is the parameterization 
of the kernel function in [15]: The evolved models showed the need for steeper filter charac- 
teristics; after adding degrees of freedom to the kernel function, the evolutionary adaptation 
led to quantitatively better fits. 
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